Improving Entity Linking in Chinese Domain by Sense Embedding Based on Graph Clustering

Abstract

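Entity linking refers to linking a string in a text to the corresponding entities in a knowledge base through candidate entity generation and candidate entity ranking. It is of great significance to some NLP (natural language processing) tasks, such as question answering. Unlike English entity linking, Chinese entity linking requires more consideration due to the lack of spacing and capitalization in text sequences and the ambiguity of characters and words, which is more evident in certain scenarios. In Chinese domains, such as industry, the generated candidate entities are usually composed of long strings and are heavily nested. In addition, the meanings of the words that make up industrial entities are sometimes ambiguous. Their semantic space is a subspace of the general word embedding space, and thus each entity word needs to get its exact meaning. Therefore, we propose two schemes to achieve better Chinese domain entity linking. First, we implement an n-gram based candidate entity generation method to increase the recall rate and reduce the nesting noise. Then, we enhance the corresponding candidate entity ranking mechanism by introducing sense embedding. Considering the contradiction between the ambiguity of word vectors and the single sense of words in the industrial domain, we design a sense embedding model based on graph clustering, which adopts an unsupervised approach for word sense induction and learns sense representation in conjunction with context. We test the embedding quality of our approach on classical datasets and demonstrate its disambiguation ability in general scenarios. Through experiments, we confirm that our method can better learn candidate entities’ fundamental laws in the industrial domain and achieve better performance on entity linking.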
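The abstract does not spell out how the n-gram based candidate generation works, so the following is only a minimal sketch of the general idea under stated assumptions: character n-grams of the input are looked up in a set of knowledge-base entity names, and hits nested inside a longer hit are discarded to reduce nesting noise. The function names `ngram_candidates` and `drop_nested`, the `max_n` parameter, and the toy knowledge base are all hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end, surface string)


def ngram_candidates(text: str, kb_names: Set[str], max_n: int = 8) -> List[Span]:
    """Return every character n-gram (length 1..max_n) that appears in kb_names.

    Working on characters avoids depending on a word segmenter,
    which matters for Chinese text that has no spacing.
    """
    hits: List[Span] = []
    for start in range(len(text)):
        for n in range(1, max_n + 1):
            end = start + n
            if end > len(text):
                break
            gram = text[start:end]
            if gram in kb_names:
                hits.append((start, end, gram))
    return hits


def drop_nested(hits: List[Span]) -> List[Span]:
    """Keep only hits whose span is not contained in a different, longer hit."""
    kept: List[Span] = []
    for s, e, g in hits:
        nested = any(
            s2 <= s and e <= e2 and (s2, e2) != (s, e) for s2, e2, _ in hits
        )
        if not nested:
            kept.append((s, e, g))
    return kept


# Toy knowledge base (assumed for illustration only).
kb = {"高压", "断路器", "高压断路器", "故障"}
all_hits = ngram_candidates("高压断路器故障", kb)
candidates = drop_nested(all_hits)
print(candidates)
```

In this toy run, the shorter names "高压" and "断路器" are matched first but then dropped because they sit inside the longer match "高压断路器". A real system would likely need a softer policy than outright dropping, since the nested entity is sometimes the intended one.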


Similar resources

A Graph-based Method for Entity Linking

In this paper, we formalize the task of finding a knowledge base entry that a given named entity mention refers to, namely entity linking, by identifying the most “important” node among the graph nodes representing the candidate entries. With the aim of ranking these entities by their “importance”, we introduce three degree-based measures of graph connectivity. Experimental results on the TACKB...


Graph-based Entity Linking using Shortest Path

Entity Linking (EL) is a technique to link named entities in a given context to relevant entities in a given knowledge base. Generally, EL consists of two important tasks, but these tasks have limitations. To overcome these limitations, we tried to solve the problem of EL through the interdependencies between not only named entities but also common nouns and verbs appearing in the conte...


Graph-Based Named Entity Linking with Wikipedia

Named entity linking (NEL) grounds entity mentions to their corresponding Wikipedia article. State-of-the-art supervised NEL systems use features over the rich Wikipedia document and link-graph structure. Graph-based measures have been effective over WordNet for word sense disambiguation (WSD). We draw parallels between NEL and WSD, motivating our unsupervised NEL approach that exploits the Wik...


Clustering based on random graph model embedding vertex features

Large datasets with interactions between objects are common to numerous scientific fields (e.g., social science, the internet, biology). The interactions naturally define a graph, and a common way to explore or summarize such a dataset is graph clustering. Most techniques for clustering graph vertices use only the topology of connections, ignoring information in the vertex features. In this pap...


Graph Classification and Clustering Based on Vector Space Embedding




Journal

Journal title: Journal of Computer Science and Technology

Year: 2023

ISSN: 1666-6046, 1666-6038

DOI: https://doi.org/10.1007/s11390-023-2835-4